Abstract



The purpose of this presentation is to demonstrate a new video evaluation 
instrument designed specifically for distance education. It was designed to be used for 
instructional design consultation, distance education teacher training, or research. 
Categories include students interacting with the teacher, other students, and content. 

Instructional designers in distance education are not always developing instruction; 
sometimes they are facilitating the transfer of teachers from a traditional to a high-tech setting. In a 
two-way television setting, teachers accustomed to the rich environment of subtle 
communication through body language may find themselves stymied by the cavernous feeling 
of a television classroom. Beyond the worries of the technology and "being on TV," one may 
experience a disconnectedness that makes the meaning of "distance" even more pronounced. 
One obstacle facing teachers is facilitating interaction, an essential component to success in 
traditional and distance classrooms (Fulford & Zhang, 1993b; Garrison, 1990). Traditional 
methods of interaction may not transfer well to television (Garrison, 1990; Moore, 1989). 
Having an on-site group of students may increase the comfort level, but could this raise 
the risk of forgetting the receive-site students? Unfortunately, if the distance learners do not 
perceive acceptable levels of overall interaction, they are less satisfied with the instruction 
(Fulford & Zhang, 1993a). 

The linguistic study of discourse has a rich past (Searle, 1969; Flanders, 1970). For 
the distance educator, a linguistic approach may not be the most expedient. Moore (1989) 
says that interaction "carries so many meanings as to be almost useless unless specific sub 
meanings can be defined and generally agreed upon" (p. 1). He provides a framework for 
studying interaction in distance education suggesting three distinct, but closely related 
types: learner-to-instructor, learner-to-learner, and learner-to-content. The purpose of this 
research was to create an instrument to quantify and classify interaction in distance 
education using Moore's (1989) three categories. The instrument was designed to be used 
for instructional design consultation, distance education teacher training, and research. 

Johnson (1987) found that teachers could be trained to improve their ability to 
facilitate interaction using video and audio tapes to analyze and categorize their behavior. 
He used Flanders' (1970) model of interaction analysis to develop the instrument. Flanders' 
system of interaction analysis was defined by Verduin (1970) as "the systematic 
quantification of behavioral acts or qualities of behavior [sic] acts as they occur in some sort 
of spontaneous interaction" (p. 32). "Teacher talk" is divided into seven sub-categories, four 
that are indirect, three direct. The indirect categories are: accepting feelings, praising or 
encouraging, accepting or using ideas of the student, and asking questions. The direct 
categories are: lecturing, giving directions, and criticizing or justifying authority. There are 
only two "student talk" sub-categories: response and initiation. "Silence" is a category 
falling outside the teacher and student domain. Johnson updated the names of some 
categories, but the meaning was essentially the same. 

Changing philosophies of classroom organization are altering the way teachers 
facilitate interaction in the traditional classroom. Hertz-Lazarowitz and Shachar (1990) 
discuss the differences between the traditional "whole class" method versus a "group 
investigation" method. They said that "Flanders and his colleagues assessed teachers' 
verbal behavior mostly in traditional direct-instruction classrooms" (p. 79) and therefore, 
additional categories of behavior needed to be derived from observing cooperative 
classrooms. Altogether, twenty categories were defined from transcripts of cooperative 
classes: instructing, lecture, short questions to elicit short answers, translation, 
interruption, disciplining one student, disciplining the whole class, disciplining by proxy, 
pluralizing, prompting, mediating, mechanical reinforcement, competitive reinforcement, 
spontaneous reference to children's initiatives, helping the child in the course of learning, 
encouraging interaction among children, referring matter-of-factly to problems of procedure 
and organization, reference to students' performance, individual personal reinforcement, and 
revealing emotions. 

Moore (1989) has a student-centered perspective. The three types he has specified 
all originate from the learner, not the teacher. Johnson's (1987) and Hertz-Lazarowitz and 
Shachar's (1990) instruments are mainly teacher-centered. Both interaction analysis 
instruments to evaluate classroom dialogue are detailed and differentiate between micro- 
levels of teacher behavior. This suggests a major theoretical division in interaction analysis. 
Conventional instruments (Johnson, 1987; Hertz-Lazarowitz & Shachar, 1990) may have 
been successful in traditional settings for research and training, but applying them to the 
distance education TV classroom is problematic. The purpose of interaction analysis may be 
different. Johnson (1985) and Hertz-Lazarowitz and Shachar (1990) focus on "functional" 
analysis, while Moore's framework focuses on the "parties involved". Interaction analysis in 
the traditional classroom is often for the purpose of training new teachers (Johnson, 1987). 
In many distance education TV settings, veterans already have teaching skills, so the 
emphasis is on adjusting to the two-way capabilities of the television classroom. The 
dilemma is not so much compelling teachers to interact locally as compelling them to interact with all 
students across the distance. Since both parties are essential to the interaction process, 
this research attempts to give equal care to students and teachers. 

The idea of analyzing interaction in two-way television is valid, but the instrument 
must be suitable for the setting. Hiring trained classroom observers is costly. A workable 
tool has to be simple to use, and require minimal training. Videotaping may be one tool for 
evaluating interaction. However, videotaping alone is not enough, because no analysis can 
be conducted without a systematic, objective-based approach. Simply watching videos of 
themselves in the classroom may alienate "technophobic" teachers. 

The Conceptual Framework 

Some may assume that interaction requires overt speaking behavior. However, 
interaction has also been defined as covert behavior, that is, carrying on an internal 
conversation (Kruh & Murphy, 1990). Therefore, it seems important to examine all facets of 
classroom communication in the study of interaction. Fulford (1993), in a model of cognitive 
speed, explains that one-way communications such as television and lectures are half the 
speed of the mind's cognitive capacity. The pace of one-way communication may be too slow 
and distract the learner. Two-way communication requires interaction, keeping the learner's 
mind occupied. Normal speech is usually 125-150 words per minute (wpm), but the 
theoretical cognitive capacity is 250-300 wpm. The model illustrates that this provides 
enough capacity for speakers simultaneously to speak and to monitor their delivery, 
and for listeners to listen and to prepare responses. Since listening requires only 125-150 
wpm, "if they aren't engaged in a situation in which they must interact, their renegade 
thought patterns may dominate their cognitive activity" (Fulford & Zhang, 1993a). This 
may explain why anticipated interaction has been linked to positive learner attitudes 
(Yarkin-Levin, 1983). 



Flanders' (1970) direct and indirect categories also seem to reflect the idea of one- 
way and two-way communication. "Direct influence ... tends to minimize the freedom of the 
student, because the teacher directs the learning activity. The second factor, indirect 
influence, would have the opposite effect, or that of maximizing the freedom of the student to 
respond" (Verduin, p. 32). 

For this study, Moore's (1989) framework of learner-to-instructor, learner-to-learner, 
and learner-to-content categories were sub-divided into "one-way" communication and "two- 
way" communication using Fulford's model (1993). Since interaction may imply a need for 
overt responses, the word "communication" was used to include both types of classroom 
behavior. One-way communication was considered a uni-directional information flow with a 
passive receiver. It was defined in this study as communication directed at the entire class 
with no expectation of a response, such as the teacher lecturing or giving directions, a 
student making a presentation, or a pre-recorded video-tape being shown. Two-way 
communication was considered as a multi-directional information flow requiring overt 
interaction and active participation from at least two people. It was defined in this study as 
directed communication with expectation of a response. Similar to Flanders' (1970) indirect 
categories, this included: asking and answering questions, responding with praise, or 
encouraging. 

In a traditional classroom, sub-categories of teacher to student and student to 
student may be sufficient, but in a distance setting it seemed important to analyze 
interaction across all sites. Although Fulford and Zhang (1993a) indicate that every student 
does not have to participate publicly to enjoy satisfaction from interaction, there could be a 
group identity that says "if my site is not called on, I'm being ignored." For this reason, two- 
way communication was sub-divided into two teacher categories and five student categories. 
The categories differentiated between the person initializing the communication and the 
person responding. Teacher-to-student and student-to-teacher were standard categories. 
The student-to-group category was provided in consideration of collaborative learning 
techniques and to examine how much freedom students have to converse without teacher 
intervention. Teacher-to-specific location and student-to-specific location were designed to 
examine whether distance interaction is occurring that takes into account group identity. A 
student-to-all category was used to find out if students were encouraged to address everyone 
across the distance, rather than just the teacher. The student-to-content category was to 
examine how much active involvement students have with the instructional materials. Since 
a large part of classroom time could be taken up by management issues, non-instructional 
categories were provided for both one-way and two-way communications. 

Flanders (1970) and Johnson (1987) both required the coding of categories every 
three seconds. Skill training, which lasted several hours, was supposed to acclimate the rater 
to three-second intervals. There was no actual timing device used. Analysis was often 
carried out through real-time observation. In this study, video analysis had the advantage 
of allowing the rater to record the exact time a category occurred and to replay the 
instruction to be sure all coding was accurate. This method allowed coding of "events" or 
discrete topics instead of chopping the instruction into three second bits. Each occurrence of 
a category was defined as an event. An event started with the initiation of a new topic. For 
example, a student asked "How do you keep the mouse cage clean?" The teacher responded 
"Does anyone have suggestions?" Another student offered "I use soil instead of wood 
chips..." The teacher praised, "Good idea..." When a third student asked "What do you say 
to the children if the mouse dies?" this began a new topic and therefore a new event. 
Analyzing interaction by category per event provides a wealth of information about 
the direction and participants of interaction. However, additional information can be 
obtained by examining single transactions. To examine the richness of an event, it is 
important to know how many exchanges occurred between communicators. A single 
exchange, although lasting several minutes, is very different from numerous exchanges over 
the same period. Coding discrete transactions may help determine how many people were 
involved in the instructional event. Exchanges that involve only one teacher and one student 
seem limited, when the goal is to create lively group discussion. This study defined 
transaction as the contribution of a single individual. A series of exchanges or transactions 
between individuals constituted an event. 

Context of the Study 

The two-semester, two-credit university course used in the study provided in-service 
training for the Developmental Approaches in Science and Health (DASH) program. There 
were ten sessions from October 1991 to May 1992. DASH is a sequential kindergarten 
through sixth grade (K-6) program that integrates the content of science, health, and 
technology. The course was offered through the Hawaiian Interactive Television System 
(HITS) which is a 4-channel interactive inter-island closed-circuit television network that 
uses both Instructional Television Fixed Service (ITFS) and point-to-point microwave signals 
to connect six classrooms across the state. Instruction was delivered to five receive-site 
classrooms; there were no participants at the origination site. This was the first time this 
course was offered over HITS and the first time these teachers taught via two-way 
television. 

Each session had a similar format. Participants met locally for an hour with a 
DASH facilitator, before the broadcast portion that lasted one hour and fifteen minutes. 
After a brief check-in, a pre-recorded videotape was shown for about twenty minutes. 
Collaborative activities took up approximately fifteen minutes, ending with each location 
presenting their work for three minutes. The panel answered faxed questions at the end of 
the session. 

This course provided the occasion to examine a large number of participants over a 
long period of time. The researchers were not involved in the development or teaching of the 
course. The videos were analyzed independently by four research assistants to prevent 
potential bias. 

Procedures and Methodology 

The participants were K-6 teachers who were already using DASH in their 
classrooms. For most this was their first interactive TV experience. The 233 participants 
were in 5 locations: 98 in 2 two-way audio/one-way video locations, and 135 in 3 two-way 
video/two-way audio locations. The 10 sessions used in the study were recorded at the 
origination site. The videotapes included only the broadcast portions of the course. Due to 
lack of recording equipment, it was not possible to record every site at all times. Each site 
was shown on the screen as they participated. During discussion with one-way video sites, 
a still photograph of the participants was shown while the audio was heard. Collaborative 
activities were the greatest challenge to videotaping. The two-way video sites were scanned 
in sequence while open microphones collected the overall audio activity. Although this 
provided a sampling of what occurred during these activities, these activities were rated as a 
single event and transactions were not recorded. 

Four graduate students in educational technology were recruited as evaluators. All 
of them had completed instructional design coursework. They were given a description of the 
categories and then shown video-taped examples. They were shown how to code information 
onto the instrument. The training lasted about a half hour. The evaluators were asked not 
to compare their evaluation forms. They commented that the form was straightforward and 
easy to use. The four evaluators each analyzed the video tapes of the 10 sessions. 

The Evaluation Instrument 

The instrument was designed using Moore's (1989) framework for studying 
interaction in distance education. Three categories of interaction were examined: learner- 
instructor, learner-learner, and learner-content. The evaluation instrument was a half page 
form used to record each "event" of the lesson (see Figure 1). An "event" was defined as a 
single topic. For example, the teacher asked a specific question, a student answered, the 
teacher clarified, and another student expanded. This was one event. If the teacher asked 
a new question, or a student changed the focus, a new event began. The coding of the event 
was based on the person who initialized the topic, although other people became involved in 
the communication. A VCR with a time-based counter was used to identify the "beginning" 
and "ending" time of each "event". Both were recorded on the form and a "total" time was 
calculated. Since ten tapes were used in the study, the tape number and event number 
were recorded for tracking. 
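
The timing fields of the form can be sketched in code as follows. This is only an illustrative sketch: the class name `EventRecord`, the helper `counter_to_seconds`, and the "h:mm:ss" counter format are assumptions made for the example and are not part of the original instrument.

```python
from dataclasses import dataclass


def counter_to_seconds(stamp: str) -> int:
    """Convert an 'h:mm:ss' or 'mm:ss' counter reading to seconds.

    The counter format is an assumption; the study only states that a
    VCR with a time-based counter was used.
    """
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


@dataclass
class EventRecord:
    """One coded event (hypothetical structure mirroring the form)."""
    tape: int    # tape number, recorded for tracking
    event: int   # event number within the tape
    begin: str   # counter reading at the beginning of the event
    end: str     # counter reading at the end of the event

    @property
    def total_seconds(self) -> int:
        """The 'total' field: ending time minus beginning time."""
        return counter_to_seconds(self.end) - counter_to_seconds(self.begin)


# Made-up readings for illustration only, not data from the study.
record = EventRecord(tape=3, event=7, begin="0:12:40", end="0:15:05")
print(record.total_seconds)  # 145
```

Keeping the beginning and ending readings, rather than only the total, lets a second rater replay exactly the same segment of tape.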

The type of interaction was then recorded. "One-way" interaction was defined as 
communication directed at the entire class with no expectation of response. There were four 
"one-way" categories. "Non-instructional" events pertained to the management of the class, 
not the content of the lesson; these included directions about turning in homework, 
purchasing materials, operating equipment, and so forth. "Teacher only" events were lecture 
type events. "Student only" events were presentations made by students. "Content only" 
was for presentations using media such as pre-recorded video tapes. 

Figure 1. Video evaluation instrument. 

Beg. Time ______   End Time ______   Total ______   Tape/Event # ______

Type of Interaction

One-Way                  Two-Way
  Non-Instructional        Non-Instructional      Student-Teacher
  Teacher Only             Teacher-Student        Student-Group
  Student Only             Teacher-Spec.Loc.      Student-Spec.Loc.
  Content Only             Student-Content        Student-All

Transactions
  ______________________________   Total ______
  ______________________________   Total ______
  ______________________________   Total ______
                                   Final ______

UHM   LCC   KCC   Hilo   Maui   Molokai   Kauai   Kailua      # of people ______

Notes

"Two-way" interaction was defined as directed communication with the expectation of 
a response. There were eight categories of "two-way" interaction. "Non-instructional" two- 
way differed from "non-instructional" one-way in that responses were given. Site sign-ons 
are an example of a "two-way non-instructional" event. "Teacher-student" was a teacher 
asking a question directed to all students. "Teacher-specific location" was a teacher 
directing a question to a specific location or site. "Student-teacher" was the student asking 
a question directed to a teacher. "Student-specific location" was a student directing a 
question to a specific location or individual. "Student-group" was a student conversing 
within their own site, including the entire group or groups formed for collaborative activities. 
"Student-content" was a student interacting with course materials that required active 
participation. This category was intended for written activities, reading, or using computer 
assisted instruction, although in this study, none of these events occurred. "Student-all" 
was when a student asked a question for anyone to respond to. 

Next, individual transactions were recorded. This portion of the instrument was 
designed to provide an indication of the richness of the communication. For example, if the 
teacher asked a question, a student provided a short answer, and then the teacher 
expounded for several minutes, the event would be very teacher focused. However, if the 
teacher asked a question, a student provided a short answer, then the teacher asked for 
elaboration, another student responded, and another teacher provided another example, the 
event would be richer and more student focused. These events may take the same amount 
of time, but by recording transactions, an instructional designer could examine patterns of 
interaction. 

For this study, transactions were recorded by using T for teacher and S for student. 
If more than one teacher or student was involved in an event, numbers were added. The 
recording of the first example above would be T S T, the second example would be T S T S2 
T2. To ensure the accuracy of the ratings, evaluators were asked to rewind the tape and 
record lengthy events three times (see the three lines in Figure 1). The number of 
transactions was counted. A final count was determined using the evaluators' best 
judgment of which of the three attempts was most accurate. The evaluators then counted 
and recorded the number of people involved in each event and circled the involved sites. 
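
The T/S notation described above lends itself to simple tallying. The sketch below is illustrative only: in the study the evaluators counted transactions and people by hand, and the function name `summarize_transactions`, as well as the convention that a bare "T" or "S" means the first teacher or student, are assumptions made for the example.

```python
def summarize_transactions(coded: str) -> dict:
    """Count transactions and distinct speakers in one coded event.

    Each whitespace-separated token is one transaction: 'T' (teacher)
    or 'S' (student), optionally followed by a number identifying a
    second, third, ... person. A bare letter is treated as person 1
    (an assumption for this sketch).
    """
    tokens = coded.split()
    speakers = set()
    for token in tokens:
        role, number = token[0].upper(), token[1:] or "1"
        if role not in ("T", "S") or not number.isdigit():
            raise ValueError(f"unrecognized code: {token!r}")
        speakers.add((role, int(number)))
    return {
        "transactions": len(tokens),
        "people": len(speakers),
        "teachers": sum(1 for role, _ in speakers if role == "T"),
        "students": sum(1 for role, _ in speakers if role == "S"),
    }


# The two examples from the text.
print(summarize_transactions("T S T"))
# {'transactions': 3, 'people': 2, 'teachers': 1, 'students': 1}
print(summarize_transactions("T S T S2 T2"))
# {'transactions': 5, 'people': 4, 'teachers': 2, 'students': 2}
```

Both events begin the same way, but the second shows more transactions and twice as many people, which is the kind of difference in richness the transaction record is meant to expose.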

Reliability 

The video interaction analysis instrument was tried out during the DASH program. 
The raters viewed the video tapes of the ten DASH sessions independently and recorded: 

1. The total number of occurrences for each type of interaction in each session. 

2. The total time spent for each type of interaction in each session. 

3. The total number of events in each session. 

4. The total number of transactions in each session. 

5. The total number of people involved in all the events in each session. 

The information above was used to generate a quantitative summary of the overt 
interaction in the TV classroom. From Item 1, one could see how frequently each type of 
interaction occurred in any session and determine whether the interaction pattern over TV 
was balanced or appropriate. Item 2 showed the proportion of time actually spent on each 
type of interaction, which revealed the time reserved for the particular type of interaction. 
Item 3 showed how many topical segments or events of teacher-student interaction occurred 
(Figure 2). Item 4 showed how many exchanges or transactions occurred (Figure 3). 

Figure 2. Total Events by Rater by Session. [Plot not reproduced; axes: Events by Session.]

Figure 3. [Plot not reproduced.]

Figure 4. Total Transactions per Event. [Plot not reproduced; axes: Events by Session.]

Figure 5. [Plot not reproduced.]

Figure 6. Total People per Event. [Plot not reproduced; axes: Events by Session, S1 to S10; legend: Total, Rater 1, Rater 2, Rater 3.]

Items 3 and 4 produced the average number of transactions per event, which was a 
convenient index of the extent to which an instructor allows a topical segment to elapse 
(Figure 4). Item 5 showed the total involvement during the session (Figure 5). This total 
was derived from adding the number of people in each event; therefore, the same individuals 
may be counted a number of times to create this figure. Items 3 and 5, on the other hand, 
showed the average number of participants the instructor allowed in an event (Figure 6). A 
highly interactive class would be expected to generate more exchanges and engage more 
students per event than a lecture-type, non-interactive class. 
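
The session-level indices built from Items 3, 4, and 5 amount to simple sums and ratios. The following sketch shows one way to compute them; the function name `session_summary` and the sample numbers are hypothetical and are not taken from the DASH data.

```python
def session_summary(events):
    """Summarize one session.

    `events` is a list of (transactions, people) pairs, one pair per
    coded event. People are summed across events, so an individual who
    takes part in several events is counted several times, as in the
    study's "total involvement" figure.
    """
    if not events:
        raise ValueError("a session needs at least one coded event")
    n_events = len(events)                          # Item 3
    total_transactions = sum(t for t, _ in events)  # Item 4
    total_people = sum(p for _, p in events)        # Item 5
    return {
        "events": n_events,
        "transactions": total_transactions,
        "people": total_people,
        "transactions_per_event": total_transactions / n_events,
        "people_per_event": total_people / n_events,
    }


# Made-up session with four events, for illustration only.
print(session_summary([(3, 2), (5, 4), (1, 1), (7, 3)]))
# {'events': 4, 'transactions': 16, 'people': 10,
#  'transactions_per_event': 4.0, 'people_per_event': 2.5}
```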

Aggregate reliability is considered invalid if the pairwise inter-rater reliabilities vary 
a great deal, for instance, between -0.6 and +0.6. Aggregate reliability is valid only when 
the pairwise inter-rater reliabilities are similar (Overall, 1965). All four raters provided 
the total number of occurrences for each type of interaction and the total time actually spent 
on each type of interaction in each session. Their mean pairwise inter-rater reliabilities and 
the aggregate reliabilities on the two variables are reported in Table 1. The reliabilities are 
generally acceptable, except in those categories where few occurrences were observed. Not 
all the categories of interaction are in Table 1, because some types of interaction did not 
take place in the DASH program. The "content only" category had only one occurrence, 
thereby creating a misleading perfect reliability, so it was not reported. 

Three raters completed analyzing the events and transactions in the video tapes of 
the 10 DASH sessions. Therefore, the reliabilities reported below in Table 2 were based 
upon three, not four, raters. The reliabilities reported show that using the mean of three or 
four raters will result in high reliabilities. 

Table 1 

Reliabilities for total number of occurrences and 
total time spent for each type of interaction 

                                Mean inter-rater    Aggregate
                                reliability         reliability
Type      Sub-category          Number  Time        Number  Time

One-way   Non-instructional     0.38    0.53        0.71    0.82
          Teacher only          0.89    0.93        0.97    0.98
          One-way Total         0.60    0.85        0.86    0.96

Two-way   Non-instructional     0.72    0.62        0.91    0.88
          Teacher to student    0.39    0.33        0.72    0.68
          Student to teacher    0.81    0.66        0.94    0.89
          Student to student    0.31    0.91        0.64    0.98
          Two-way Total         0.27    0.97        0.61    0.99

Table 2 

Reliabilities for total number of events, transactions, and people involved, 
and average number of transactions per event and people per event 

                                            Mean Pairwise              Aggregate
                                            Inter-rater Reliability    Reliability

Number of Events                            0.80                       0.93
Number of Transactions                      0.79                       0.92
Number of People Involved                   0.66                       0.85
Average Number of Transactions per Event    0.61                       0.82
Average Number of People per Event          0.62                       0.83

After the four raters had analyzed the video tapes of the 10 DASH sessions, 
pairwise inter-rater reliabilities and aggregate reliabilities were calculated. The pairwise 
inter-rater reliability is simply the correlation between the counts or recorded times of any 
two judges (Sax, 1989). The aggregate or effective reliability is the reliability of the mean of 
the counts or times independently given by the four raters (Guilford, 1954; Rosenthal, 1987; 
Rosenthal & Rosnow, 1991). The aggregate reliability estimates the proportion of the 
variance in the mean of the four scores that is due to true scores, and its formula follows the 
same logic as the Spearman-Brown prophecy formula (Sax, 1989, pp. 265-266; Rosenthal & 
Rosnow, 1991, pp. 51-54). The aggregate reliability is higher than the mean of pairwise 
inter-rater reliabilities for the simple reason that employing multiple judges and adopting 
their mean is a more reliable scoring procedure than relying on any two potentially 
idiosyncratic individuals serving as raters. 
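
The two reliability indices can be illustrated with a short computation. This is a sketch under stated assumptions: the paper says only that the aggregate formula "follows the same logic" as Spearman-Brown, so the standard form n * r / (1 + (n - 1) * r) is assumed here, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_reliability(ratings):
    """Average correlation over all pairs of raters.

    `ratings` holds one list per rater, each with one count (or time)
    per session.
    """
    pairs = list(combinations(ratings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def aggregate_reliability(mean_r, n_raters):
    """Reliability of the mean of n_raters judges (Spearman-Brown form,
    assumed here)."""
    return n_raters * mean_r / (1 + (n_raters - 1) * mean_r)


# A mean pairwise reliability of 0.38 with four raters:
print(round(aggregate_reliability(0.38, 4), 2))  # 0.71
```

With a mean pairwise reliability of 0.38 and four raters this form gives 0.71, which agrees with the one-way non-instructional entry in Table 1. Other tabled values may differ in the second decimal place, presumably because the reported means are themselves rounded.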

Discussion 

Episodes that may be categorized into more than one category were sometimes a 
problem. For example, the difference between "non-instructional" and "teacher to student" 
was not always easy to discern. An example from the tape was a conversation that began 
with asking about materials being mailed, which is a "non-instructional" event, but ended with a 
discussion of how the materials should be used when they arrive, a "teacher to student" 
event. One judge may decide to categorize the event at the beginning of the transaction, 
another at the end, and still another may break the event into two separate events. These 
episodes that do not seem to have precise beginnings and endings may cover more or less 
time allotment units. Such ambiguity is more evident in some categories, such as "two-way 
teacher-to-student" and "two-way student-to-student". It is not clear whether these 
categories are superfluous or whether the ambiguity is due to the way this course was taught. Categories such as 
these could be grouped together or eliminated; however, it is possible that in some classes the only 
interaction occurring is "non-instructional." In this case, removing the category eliminates 
important information. The findings suggest that, when training judges, those categories should 
be emphasized. 

Some sub-categories produced little or no data. For this study, "student to specific 
location," "student to group," and "student to all" were collapsed into an overall student to 
student sub-category to create more meaningful analysis. "Student to content" had to be 
eliminated for lack of data. Does it mean these categories are superfluous? Or, is it only 
because of the way this particular course was taught? In the former case, they may be 
eliminated from the instrument or grouped in order not to distract judges. Although the 
rating process is simplified, removing categories reduces detailed information. 

The "content" categories also caused some problems. The episode of watching a 
video-tape was too obvious to really test the extent of the concurrence among the raters. In 
the "two-way, student-to-content" sub-category, no episodes occurred. "Content" events may 
be both difficult to measure and difficult to include in an interactive TV setting. It may be 
considered a waste of costly air time to show video-tapes, have the students read, or 
complete written or computer assisted instruction alone. These activities can be done "off- 
air" while collaborative activities and a discussion of the results "on-air" may increase the 
amount of interaction time available. More research is needed regarding the concept of 
"interacting with content" as proposed by Moore (1989). 

The overall reliability of this instrument is sufficiently high to warrant its use in 
analyzing interaction in distance education. Care should be taken not to just consider the 
amount of time spent interacting, but also the richness and patterns of interaction. By 
using this instrument as a consulting tool, instructional designers may be able to help 
instructors improve the quality of interaction across the distance. 